N=1:

rank | frequency | n-gram |
---|---|---|
1 | 602073 | -н |
2 | 431630 | -ң |
3 | 431097 | -а |
4 | 400398 | -ы |
5 | 236704 | -і |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 253967 | -ың |
2 | 228757 | -ен |
3 | 154238 | -ан |
4 | 145667 | -ің |
5 | 118520 | -ды |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 119223 | -ның |
2 | 106435 | -мен |
3 | 72400 | -дың |
4 | 54600 | -лық |
5 | 53039 | -нің |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 51449 | -ының |
2 | 42316 | -рдың |
3 | 30592 | -дағы |
4 | 27554 | -аның |
5 | 27190 | -рдің |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 37963 | -ардың |
2 | 24866 | -ердің |
3 | 13776 | -иялық |
4 | 13726 | -рының |
5 | 12728 | -овтың |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection.
Query for N=3:
SELECT @pos:=(@pos+1) AS pos, xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
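
The same counting can be sketched outside the database for any N. The following is a minimal illustration in Python, not the code used to produce the tables. It assumes the word list is already available as one string per entry of the `words` table; the function name `top_word_endings` and the sample words are hypothetical, and the `w_id>100` filter from the query is not modelled.

```python
from collections import Counter


def top_word_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent word endings.

    Mirrors RIGHT(word, n) in the query: a word shorter than n letters
    contributes itself as a whole, because slicing does not pad.
    """
    counts = Counter(word[-n:] for word in words)
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(counts.most_common(limit), start=1)
    ]


# Hypothetical sample input: one string per distinct word form.
sample = ["қаланың", "баланың", "даланың", "елдің", "жердің", "мен"]
for row in top_word_endings(sample, 3):
    print(row)
```

Each word form is counted once here, as in the query, which groups rows of `words` rather than running-text occurrences. Weighting by corpus frequency would need an additional frequency column, which this sketch does not assume.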